Elastic Resource Provisioning for Batched Stream Processing System in Container Cloud
Authors
Abstract
Batched stream processing systems achieve higher throughput than traditional stream processing systems while still providing low-latency guarantees. Because they require elasticity and cost efficiency, such systems are increasingly deployed in the cloud. However, their performance is hard to guarantee there, because static resource provisioning cannot accommodate stream fluctuation and uneven workload distribution. In this paper, we propose EStream, an elastic batched stream processing system based on Spark Streaming that transparently adjusts the available resources to handle workload fluctuation and uneven distribution in a container cloud. Specifically, EStream automatically scales the cluster when it detects resource insufficiency or over-provisioning under workload fluctuation, and it schedules resources within the cluster according to the workload distribution. Experimental results show that, compared to the original Spark Streaming, EStream handles workload fluctuation and uneven distribution transparently and improves resource efficiency.
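The abstract does not say how EStream detects resource insufficiency or over-provisioning. As a minimal sketch of the general idea only, the snippet below assumes a simple threshold rule: a batched system falls behind when a batch takes longer to process than the batch interval, so the ratio of processing time to batch interval serves as a load signal. The names `ScalingPolicy` and `decide_executors`, and every threshold value, are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Hypothetical thresholds; none of these values come from the paper."""
    batch_interval_s: float = 2.0  # assumed batch interval, in seconds
    upper_ratio: float = 0.9       # above this load ratio, add an executor
    lower_ratio: float = 0.5       # below this load ratio, remove an executor
    min_executors: int = 1
    max_executors: int = 16


def decide_executors(current: int, processing_time_s: float,
                     policy: ScalingPolicy) -> int:
    """Return the executor count to use for upcoming batches.

    The load ratio is processing time divided by batch interval. A ratio
    near or above 1 means batches are at risk of queueing up (resource
    insufficiency); a low ratio suggests over-provisioning.
    """
    ratio = processing_time_s / policy.batch_interval_s
    if ratio > policy.upper_ratio:
        target = current + 1
    elif ratio < policy.lower_ratio:
        target = current - 1
    else:
        target = current
    # Clamp the result to the allowed cluster size.
    return max(policy.min_executors, min(policy.max_executors, target))
```

With the assumed defaults, a batch that takes 2.5 s against a 2 s interval triggers a scale-out by one executor, while a 0.5 s batch triggers a scale-in. A real system would likely smooth the signal over several batches before acting, to avoid oscillation.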
Similar resources
Cost-efficient enactment of stream processing topologies
The continuous increase of unbound streaming data poses several challenges to established data stream processing engines. One of the most important challenges is the cost-efficient enactment of stream processing topologies under changing data volume. These data volumes pose different loads to stream processing systems whose resource provisioning needs to be continuously updated at runtime. First...
An Elastic Data Stream Processing Ecosystem for Distributed Environments
In the last couple of years, we have observed a trend towards an ever-growing number and volume of data streams. Up to now, these data streams were mainly originating from social media services but today the emergence of the Internet of Things (IoT) also contributes to the growth of data streams. Besides the growth of the data volume, the IoT also introduces several new challenges, like the geo...
On the Cost-QoE Trade-off for Cloud Media Streaming under Amazon EC2 Pricing Models
Exponential growth of video traffic challenges the current paradigm for streaming large amounts of video content to end users. Cloud computing, with its support for elastic resource allocation, enables cost-effective video streaming with desired QoE requirements. We abstract a new theoretical model from real systems for elastic media streaming by introducing a virtual content service provider that rents c...
Maximum Sustainable Throughput Prediction for Large-Scale Data Streaming Systems
In cloud-based stream processing services, the maximum sustainable throughput (MST) is defined as the maximum throughput that a system composed of a fixed number of virtual machines (VMs) can ingest indefinitely. If the incoming data rate exceeds the system’s MST, unprocessed data accumulates, eventually making the system inoperable. Thus, it is important for the service provider to keep the MS...
Elastic Allocation of Docker Containers in Cloud Environments
Docker containers wrap up a piece of software together with everything it needs for execution and make it easy to run on any machine. For their execution in the Cloud, we need to identify an elastic set of virtual machines that can accommodate those containers, while considering the diversity of their requirements. In this paper, we briefly describe our formulation of the Elastic provis...
Publication date: 2017